
Binary patching rust hot-reloading, sub-second rebuilds, independent server/client hot-reload #3797


Merged
jkelleyrtp merged 305 commits into main from jk/binary-patch on May 7, 2025

Conversation

@jkelleyrtp (Member) commented Feb 25, 2025

Inlines the work from https://github.com/jkelleyrtp/ipbp to bring pure rust hot-reloading to Dioxus.

fast_reload.mp4

The approach we're taking works across all platforms, though each will require some bespoke logic. The object crate is thankfully generic over mac/win/linux/wasm, though we need to handle the system linkers differently.

This change also enables dx to operate as a faster linker, allowing sub-second (in many cases, sub-200ms) incremental rebuilds.
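Roughly, the trick is that rustc is pointed at dx as its "linker"; when dx detects it's been invoked that way, it just records the object files for the patching step instead of performing a full link. Here's a minimal sketch of that detection (the env var name is made up for illustration, not the real dx variable):

fn maybe_act_as_linker() -> std::io::Result<bool> {
    // sketch only: `DX_LINK_ARGS_FILE` is a hypothetical name for illustration
    let Ok(dest) = std::env::var("DX_LINK_ARGS_FILE") else {
        return Ok(false); // normal `dx` invocation, not a linker call from rustc
    };

    // record the linker args / object files for the parent dx process,
    // which then diffs and links only what changed
    let args: Vec<String> = std::env::args().skip(1).collect();
    std::fs::write(dest, args.join("\n"))?;
    Ok(true)
}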

Todo:

  • Add logic to the devtools types and generic integration
  • Wire up desktop
  • Rework existing hot-reload engine to be properly compatible
  • Remove old binaries
  • Wire up iOS
  • Wire up macOS
  • Wire up Android
  • Wire up Linux
  • Wire up wasm
  • Wire up windows
  • Wire up server
  • Clean up app/server impl (support more than two executables in prep for dioxus.json)
  • Fix integration with the old hot-reload engine

Notes:

This unfortunately brings a very large refactor to the build system, since we need to persist app bundles while allowing new builds to be "merged" into them. I ended up flattening BuildRequest + Bundle together, and Runner + Builder together, since we need knowledge of previous bundles and currently running processes to get patching working properly.

@jkelleyrtp changed the title from "Binary patching rust hot-reloading" to "Binary patching rust hot-reloading, sub-second rebuilds" Feb 25, 2025
@jkelleyrtp changed the title from "Binary patching rust hot-reloading, sub-second rebuilds" to "Binary patching rust hot-reloading, sub-second rebuilds, independent server/client hot-reload" Feb 25, 2025
@jkelleyrtp (Member, Author) commented Mar 18, 2025

progress update

I've migrated everything over from ipbp, so anyone should now be able to run the demos on macOS/iOS. Going to add Linux + Android support next.

I've been tinkering with the syntax for subsecond a bit and am generally happy now with the API. You can wrap any closure with ::call() and that closure is now "hot":

pub fn launch() {
    loop {
        std::thread::sleep(std::time::Duration::from_secs(1));
        subsecond::call(|| tick());
    }
}

fn tick() {
    println!("boom!");
}

If you need more granular control over "hot" functions, then you'll want to use ::current(closure), which gives you a HotFn with extra flags and methods for running a callback. It also lets you run closures that are FnOnce, which ::call() currently does not.
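For reference, here's a minimal sketch of the HotFn path (the exact names and signatures may still shift before this lands):

pub fn launch() {
    // sketch: HotFn resolves to the most recently patched version of `tick` at call time
    let mut hot_tick = subsecond::HotFn::current(tick);
    loop {
        std::thread::sleep(std::time::Duration::from_secs(1));
        hot_tick.call(());
    }
}

fn tick() {
    println!("boom!");
}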

::call() taking an FnMut is meant to provide an "unwind" point that our assembly-diffing logic can bounce up to by emitting panics. This is meant to support cases where you might add a field to a struct and need to "rebuild" the app from a higher checkpoint (aka re-instancing).

For example, a TUI app with some state:

struct App {
    should_exit: bool,
    temperatures: Vec<u8>,
}

might implement a "run" method that calls subsecond:

    fn run(&mut self, terminal: &mut DefaultTerminal) -> Result<()> {
        while !self.should_exit {
            subsecond::call(|| self.tick(terminal))?;
        }
        Ok(())
    }

If the struct's size/layout changes, then we want to rebuild the app from scratch. Alternatively, we could somehow migrate it; that's out of scope for this PR, but implementations can be found in libraries like dexterous. We might end up taking an approach that unwinds the stack to the app's constructor and then copies the old state into the new size/layout, merging the new fields in. TODO on what this should look like.
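A hedged sketch of what that could look like today, assuming a ratatui-style setup where the patch signals a layout change by panicking out of subsecond::call:

fn main() -> Result<()> {
    let mut terminal = ratatui::init();
    let result = loop {
        // re-instance the App on every unwind so a patched layout never aliases old state
        let mut app = App { should_exit: false, temperatures: Vec::new() };
        match std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| app.run(&mut terminal))) {
            Ok(res) => break res, // app exited normally
            Err(_) => continue,   // layout changed: rebuild from this higher checkpoint
        }
    };
    ratatui::restore();
    result
}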

Here's a video of the tui_demo in the subsecond_harness crate:

subsecond-tui.mp4

runtime integration

Originally I wanted to use LLDB to drive the patching system - and we still might need to for proper "patching" - but I ran into a bunch of segfaults and LLDB crashes when we sigstopped the program in the middle of malloc/free. Apparently there's a large list of things you cannot do while a program is sigstopped, and using allocators is one of them. We could look into using a dedicated bump allocator and continue using LLDB, but for now I have an adapter built on websockets. We might end up migrating to a shared-memory system so that the HOST and DUT can share the patch table freely. The challenge with those approaches is that they're not very portable, whereas websockets are available literally everywhere.
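As a rough illustration of the shape of that adapter (these message types are made up for the sketch and are not the real devtools types):

// hypothetical host <-> app messages for the websocket adapter
#[derive(serde::Serialize, serde::Deserialize)]
enum HostMsg {
    // host tells the running app where the freshly linked patch lives
    ApplyPatch { patch_path: std::path::PathBuf },
}

#[derive(serde::Serialize, serde::Deserialize)]
enum AppMsg {
    // app reports its ASLR base so the host can compute patched addresses
    AslrReference(u64),
    PatchApplied { ok: bool },
}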

zero-link / thinlink

One cool thing spun out of this work is "zerolink" (thinlink, maybe?): our new approach for drastically speeding up Rust compile times by automatically using dynamic linking. This is super useful for tests, benchmarks, and general development since we can automatically split your workspace crates from your "true" dependencies and skip linking your dependencies on every build.

This means you can turn up opt levels and keep debug symbols (two things that generally slow down builds), incur that cost once, and then continuously dynamically link your incremental object files against the dependencies dylib. Most OSes support a dyld_cache equivalent which keeps your dependencies.dylib memory-mapped and cached between invocations, which also greatly speeds up launch times.

ZeroLink isn't really an "incremental linker" per se, but it behaves like one thanks to Rust's incremental compile system. In spirit it's very similar to marking a crate as a dylib in your crate graph (see bevy/dynamic), but it doesn't require you to change any of your crates and it supports WASM.

dx is standalone

I wanted to use zerolink with non-dioxus projects, so this PR also makes dx a standalone Rust runner. You can dx run your project and dioxus does not need to be part of your crate graph for it to work. This lets us bootstrap dx by running dx with itself, making it easy to update the TUI without fully rebuilding the CLI.

wasm work

WASM does not support dynamic linking, so we need to mess with the binaries ourselves. Fortunately this is as simple as linking the deps together into a relocatable object file, lifting the symbols into the export table, and recording the element segments.

When the patches load, they need two things:

  • addresses within the ifunc table for ifuncs
  • imports from the main module

Unfortunately the wasm-bindgen pass runs ::gc, so I don't think there's any cool combination of flags we can use against wasm-ld to do this for us automatically. However, all the work we put into wasm_split really comes in handy here.
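For a sense of the "lift the symbols into the export table" step, here's a hedged sketch against the walrus crate; the real pipeline does more (element segments, ifunc bookkeeping) and may not use walrus at all:

use anyhow::Result;

// sketch: export every named function from the main module so a dynamically
// loaded patch can import them
fn export_all_funcs(wasm_bytes: &[u8]) -> Result<Vec<u8>> {
    let mut module = walrus::Module::from_buffer(wasm_bytes)?;

    // collect ids first so we don't mutate exports while iterating funcs
    let named: Vec<(String, walrus::FunctionId)> = module
        .funcs
        .iter()
        .filter_map(|f| f.name.clone().map(|name| (name, f.id())))
        .collect();

    for (name, id) in named {
        module.exports.add(&name, id);
    }

    Ok(module.emit_wasm())
}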

What's left

There are three avenues of work left here:

  • Propagating the change graph through the HotFn points
  • More platform support (windows, wasm, server_fn)
  • Bugs (better handling of statics, destructors, renaming symbols, changing signatures, and dioxus integration like Global)

I expect Windows + WASM to take the longest to support properly, and I'll prioritize that over propagating the change graph. Dioxus can function properly without a sophisticated change graph, but other libraries will want the richer detail it makes available.

@DrewRidley (Contributor)

Awesome work here! I might recommend adding .arg("-Zcodegen-backend=cranelift") as an optional user-facing argument when hot reloading.

I found on my M3 Pro MacBook it brings the average times down from ~600ms to ~300ms. The backend ships as a cargo component now, so it should be a drop-in replacement for desktop or possibly mobile platforms.

@jkelleyrtp (Member, Author) commented Mar 19, 2025

> Awesome work here! I might recommend adding .arg("-Zcodegen-backend=cranelift") as an optional user-facing argument when hot reloading.
>
> I found on my M3 Pro MacBook it brings the average times down from ~600ms to ~300ms. The backend ships as a cargo component now, so it should be a drop-in replacement for desktop or possibly mobile platforms.

Wow that's incredible!

On my M1 I've been getting around 900ms on the dioxus harness with the default dev profile, and 500-600ms with the subsecond-dev profile:

[profile.subsecond-dev]
inherits = "dev"
debug = 0
strip = "debuginfo"

I'll add the cranelift backend option and report back. In the interim, you can check whether that profile speeds up your cranelift builds at all.

I did some profiling of rustc, and about 100-300ms is spent copying incremental artifacts on disk. That's pretty substantial given that the whole process is around 500ms. Hopefully this is improved here:

rust-lang/rust#128320

I would like to see that time drop to 0ms at some point and then we'd basically have "blink and you miss it" hotpatching.

@DrewRidley (Contributor) commented Mar 19, 2025

I tried the profile and, with or without it, it's consistently ~300ms on my Mac. When doing a self-profile I noticed that register allocation takes a huge portion of the total time spent.

I discovered this (https://docs.wasmtime.dev/api/cranelift_codegen/settings/enum.RegallocAlgorithm.html), which might help if it's been backported to codegen_clif as a flag or option.

That seemed to be a fluke in testing, though; the remaining time is actually mostly incremental-cache-related file IO. Not sure how much can be done about that.

Regardless, this is super exciting work, let me know if there's any other way I can help.

@jkelleyrtp (Member, Author) commented Mar 22, 2025

I switched to a slightly modified approach (lower level, faster, more reliable, more complex).

This is implemented to work around a number of very challenging Android issues:

  • pointer tagging
  • MTE
  • linker namespaces
  • read/write permissions

Since this is more flexible, it should work across Linux and Windows (Android and Linux are the same). The last target is WASM.

Here's the android demo:

hotpatch-android.mp4

iOS:

ios-binarypatch.mp4

@DogeDark (Contributor) commented Mar 26, 2025

I'm trying this out on Linux Mint with rustc 1.85.1 and am encountering a few issues:

  • Had to comment out the tests at the end of packages/subsecond/subsecond-cli-support/src/wasm.rs since I don't have the files that they include_bytes!

Desktop app had these issues:

  • asset files weren't included
  • asset! macros panic on rebuilds (hot patches?)
  • When it successfully builds, it writes the patch but outputs "symbol not found _Unwind_Resume" and "symbol not found memcpy", and the build fails with "build panicked, not yet implemented"

Trying wasm, the first build fails in rust-lld with "note: rust-lld: error: unknown argument: -Wl,--whole-archive,-Wl,--no-gc-sections,-Wl,--export-all" and a bunch of warnings like "warning: x: archive member y is neither wasm object file nor llvm bitcode".

Edit: looks like wasm might not have been ready yet.

@jkelleyrtp (Member, Author) commented Mar 27, 2025

There were some issues with wasm that have now been fixed.

It should be possible to run the mini-CLI I built for the harness with:

RUST_LOG=info cargo run --package subsecond-cli -- --target wasm32-unknown-unknown

Currently we aren't running the manganis step on the patch to register new assets; we need to do that. I haven't tried it with asset!() at all, so I guess there might be other panics too, potentially related to hashes changing.

Also, the todo!() on patching x86 is expected to not work right now, since I've only tinkered with ld64 enough to know what args to pass to it:

let res = match target.architecture {
    // usually just ld64 - uses your `cc`
    target_lexicon::Architecture::Aarch64(_) => {
        // todo: we should throw out symbols that we don't need and/or assemble them manually
        Command::new("cc")
            .args(object_files)
            .arg("-Wl,-dylib")
            // .arg("-Wl,-undefined,dynamic_lookup")
            // .arg("-Wl,-export_dynamic")
            .arg("-arch")
            .arg("arm64")
            .arg("-o")
            .arg(&output_location)
            .stdout(Stdio::piped())
            .stderr(Stdio::piped())
            .output()
            .await?
    }
    target_lexicon::Architecture::Wasm32 => {
        let table_base = 2000 * (aslr_reference + 1);
        let global_base = (((aslr_reference) * (65536 * 3)) + (2097152)) as i32;
        tracing::info!(
            "using aslr of table: {} and global: {}",
            table_base,
            global_base
        );
        Command::new(wasm_ld().await.unwrap())
            .args(object_files)
            .arg("--import-memory")
            .arg("--import-table")
            .arg("--growable-table")
            .arg("--export")
            .arg("main")
            .arg("--export-all")
            // .arg("--export=__heap_base")
            // .arg("--export=__data_end")
            // .arg("--allow-undefined")
            // .arg("--unresolved-symbols=ignore-all")
            // .arg("--relocatable")
            // .arg("-z")
            // .arg("stack-size=1048576")
            .arg("--stack-first")
            .arg("--allow-undefined")
            .arg("--no-demangle")
            .arg("--no-entry")
            .arg("--emit-relocs")
            .arg(format!("--table-base={}", table_base))
            .arg(format!("--global-base={}", global_base))
            .arg("-o")
            .arg(&output_location)
            .stdout(Stdio::piped())
            .stderr(Stdio::piped())
            .output()
            .await?
    }
    _ => todo!(),
};

I'm assuming the error about memcpy is due to a similar reason (not passing linker flags for x86 at all).

Though I think some of these are actually fixed in dx itself; wasm just isn't properly integrated there yet.

There were some pretty bad bugs in wasm that seem to be squashed now. Our state-preservation engine is naive right now (it invalidates on every hotpatch call), so we need to wire it up to the object-file diffing logic from here:

fn diff(&self) -> Result<ObjectDiffResult<'_>> {

I was hardcoding the wasm linker before, which I shouldn't be doing anymore. The fact that those args are being rejected seems to be a mismatch in linker, or us incorrectly generating the stub object file (probably due to a panic or early bail).

Here's a little video of wasm now with all its bugs fixed:

hotpatch-wasm-complete.mp4

@jkelleyrtp marked this pull request as ready for review March 29, 2025 01:19
@jkelleyrtp requested a review from a team as a code owner March 29, 2025 01:19
@jkelleyrtp marked this pull request as draft March 29, 2025 01:20
@jrmoulton commented Apr 3, 2025

rust-lang/rust#139265

I'm not sure if this has already been noted, but if not: the usage of __rust_alloc will break in the upcoming stable Rust release, as that symbol will be mangled.

@jkelleyrtp merged commit d976ca1 into main May 7, 2025
17 checks passed
@jkelleyrtp deleted the jk/binary-patch branch May 7, 2025 06:09
@jkelleyrtp (Member, Author)
Two and a half months for Rust hot patching isn’t so bad :)

Hoping to get an alpha out for 0.7 tomorrow!

@RGBCube commented May 8, 2025

Will there be guides to integrate this into other Rust programs, such as Bevy / Leptos, or even just axum projects?

I'd love to use a generic-ish library and get magical hot reloading in Rust.

AnteDeliria pushed a commit to AnteDeliria/dioxus that referenced this pull request Jun 2, 2025
…server/client hot-reload (DioxusLabs#3797)

Enable runtime rust hot patching.

* roll back wry to 0.45

* cleanup rustc

* wip

* wip: fix client/server

* get the subsecond cli thing working again

* back to list of builds but with accessors

* migrate some methods to workspace, clean up impl a bit

* pass mode through build context

* use a build context, similar to cargo/rustc

* merge BuildUpdate and HandleUpdate into BuilderUpdate

* Move more resolution of args into runner

* migrate some methods to runner

* hoist out fullstack

* fix build request resolver

* yay request is cleaned up

* fixup resolution a bit

* spawn off the build!

* re-wire android autodetection

* re-wire android autodetection

* wire back up a few more things

* re-wire tooling verification

* I think it's mostly in the right condition for regular app

* add depinfo parser

* yay okay works with regular apps again

* full rebuilds are back

* rewire file wtcher

* wire patch!

* yayyyyyy patching works and is really fast

* yay android works

* clean up stdout piping

* yayyy memap on android so we don't need root

* create a static wasm jump table

* wip: global offset table for wasm

* wip... thinking about making the patches relocatable

* wip: customized relocation

* YESSSSSS RELOCATABLE WASM

* clean up impl a bit

* lil bit more cleanup

* lil bit more

* lil bit more cleanup

* lil bit more cleanup, support 32 bit platforms

* sick, wasm is completely pic

* hmmmm not quite working yet

* woooo, patches loading from rust+wasm

* integrated into the cli 😎

* condense a bit

* cleaned up request a bit

* bust fingerprints

* Make the file diffing logic a bit more sensible

* still working through a few issues, server launches but is acting weird

* fix merge conflict

* more merge issues

* remove stuff we don't want anymore

* revert name change

* wip: server not working anymore :(

* split apart "dioxus fullstack" into "dioxus server" and "dioxus web"

* fixup a few more compile errors

* grrr opening server

* simultaneous frontend and backend patching

* use rustc wrapper

* wip: passing assets to jumptable

* migrate project-like examples

* patchy  patchy server fns!

* rollback some random changes

* unwind the js files

* rollback hash

* no need for patch

* more cleanups

* lil bit more cleanup, remove some old cruft

* allow patching when wasm webpage is closed

* tiny bit more robust

* lil bit of clean up

* clean up ws code

* bit more clean ups

* condense

* undo file moves

* move back other project

* move other project

* migrate out harness and janky CLI

* fix compile

* anonymize some files, heavily document others

* lots more documentation, for posterity!

* ton more docs

* clean up the cli a bit more

* more ws cleanup

* more cleanup

* migrate build id to cli-config

* add command file handling

* random small cleanups

* wip....

* wip....

* fix: use workspace for krate discovery

* fix panic logging on serve by placing it *after* the logs

* reorder logging to be sensible on success

* bring back swc

* ws cruft

* small patches to relocs in loader

* bump krates

* hoist main funcs into the ifunc table, fixing wasm

* fix tui output for long lines

* add more linker logging

* wow, incredible, we build fat binaries, holy heckkkkk

* fix args for frontend/backend

* small cleanups

* properly send along build IDs to clients

* fix workspace compiles

* clean up jump table work a bit

* clean up logging a bit

* open existing

* fixup some commands

* open existing browser integration

* wire up the new dx serve client/server syntax

* fixup `dx run` command

* bring back some old functionality

* fix serverfn feature sets

* remove toast animation and simplify its code a bit

* less intrusive toast

* dont change build status when patching on web

* add proper cache busting to vcomponent

* clean up the patch file a bit

* more lints/checks cleaned up

* go back to TargetArgs but a BuildTargets resolved struct

* clean up more nits

* use an atomicptr for jumptable

* fix interaction with initial web load + suspense

* don't run asset system on wasm/bindgen js in dev

* reduce blast radius of PR

* use profile to determine if we're in release

* cleanup prod html template

* cleanup profiles

* fix feature resolution of fullstack

* if fullstack is explicitly false, don't use it

* light cleanups

* drop carg config2

* pass along linker args properly

* make workspace examples compile with dx again

* fewer unwraps and better error loggign

* bit more error handlign

* small cleanup

* drive-by cleanups

* use queries instead of initialize for aslr stuff

* fix hotpatch with higher opt level

* fix aslr bug from multiple clients

* fix merge conflict

* fix typos

* clippy

* fix miscompile

* fix doctest

* properly rollback wry to 0.45

* fix markdown path issues, other test issues

* fix test in router

* fix release mode test

* fix some more tests, clean up a few more items

* use fnptr instead of typeid

* clean up memozation via subsecond in core

* use "main" as sentinel

* fix imports and re-exports

* get off __rust_alloc for aslr offset detection

* wip

* wip... fixing unoptimized web

* hmmmm mmmm

* close,ish, still missing the wasmbindgen dynaimc imports

* aha! full wasm works now

- disable externref when compilng patch modules
- manually link wasm instrinsics
- displace the __wbindgen_placeholder__ table with the synthesized wbg imports

what an awful pain.

There might be a few items missing, so will test against the docsite shortly

* remove failing test

* hmmmm things aren't perfect

* from what I can tell, relocatable wasm just doesn't work with wasm-bindgen

sad

rustwasm/wasm-bindgen#1420

* AI IS LITERALLY THE WORST

* preserve wasm snippets across compiles

* IT HOTPATCHES THE DOCSITE, WE ARE LIVING IN THE FUTURE

* remove the macos cfg-out

* fix: need to allocate 1 more page

* fix url issue

* properly span props

* properly fill in extra symbols

* light cleanups and good docs

* delete cruft, better docs

* More cleanups and simplifications

* clippy, typos

* remove harnesses

* wasm-bindgen error logging

* clean up deps

* remove cruft

* proper priv/pub

* use ifuncs for all env imports

* cruft

* use par_iter, better output

* implement a cache for the base module

* delete custom sections

* cache the main native module too

* add par iter in a few more places

* clippy, map ident

* better logging and status bars

* wip: some small windows support

* wip more windowss

* fix weird regression

* implement windows jump table

* small fixes for windows

* better windows support

* windows support pt2

* rust lld is not msvc

* whole archive the rlibs

* ignore windows and sys crates

* logging

* ohhhhhh, pop out dlls

* dedupe

* thANK GOD FAT LINKING WORKS ON WINDOWS

* hoist caching

* implement patch based on cache

* fix location for pdb

* pass data symbols, correct object

* add all symbol for windows

* add log

* treat none rva as undefined

* whoops

* handle windows calling convention

* dont use relocated arm64 stub windows

* not all are text?

* WINDOWS FUNCTION PATCHING WORKS

but issue with statics?

* use our own entry

* handle windows imp symbols

* add log

* jump to the real symbol not the layout

* whoops

* hotpatching windows works completely holy heck yay

* drop linker args crate

* disable ssg playwright for now

* light cleanups

* some cleanups / preps:
- bump to serverfn 0.8.2
- fix playwright test
- fix clippy

* use fix from DioxusLabs#4042

* bring in other fix for liveview

* hoist some configuration

* use __aslr_reference instead of aslr_reference

* add better errors to patcherror

* fix android

* panic if patching fails

* clean up tomls

* wip: fixing wasm compile for server

* fullstack wasm again

* fix cfg

* fixup feature detection

* fix issue with features

* fix playwright!

* add barebones template test harness

* fix compatibility between the two hotreload engines

* fix preldue of server

* fix check for wasm-opt

* small fixes
Labels
cli Related to the dioxus-cli program

6 participants